1. INTRODUCTION

Diagnosing Islamic banks that are at risk of bankruptcy through financial analysis is fundamental to
guarding against financial difficulties [1], [2]. Building a stable and precise model to predict the
bankruptcy of a banking company is very important as the basis for a prudent approach [3], [4]. The Altman
Z-Score is used as the main model for bankruptcy prediction [5]-[7]. Researchers and practitioners try to
apply effective methods to build systems capable of predicting corporate bankruptcy, based on various
modeling techniques. Machine learning methods offer an automatic and objective way to achieve a high level
of prediction for this task [8]. Bankruptcy risk triggered by financial distress can be faced by various
industrial sectors, including the banking sector. This risk requires every bank to maintain its health
in order to survive the various shocks that occur. Bank Indonesia, as the regulator, also continues to
implement various policies to maintain bank health. The performance of Islamic
banks is monitored and evaluated from various perspectives using financial ratio performance indicators.
Financial ratio indicators are an important factor in assessing a company’s sustainability performance.
They describe changes in the bank’s financial position
because they are built from the company’s annual financial report. Good financial ratio
performance will determine the sustainability of the company [9]. One approach to predicting financial
distress using financial ratios is the Altman Z-Score. In 2000, Altman developed a modified Z-Score formula
that can be used for banking companies using only four financial ratios [10].
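
As an illustration of the four-ratio approach, the following Python sketch computes a modified Z-Score and maps it to a zone. The coefficients (6.56, 3.26, 6.72, 1.05) and the cut-offs (1.1 and 2.6) are the values commonly published for Altman's four-variable score; this paper does not list them, so they are an assumption here, and the function names and input values are hypothetical.

```python
# Sketch of a four-ratio modified Altman Z-Score (assumed coefficients and cut-offs).
COEFFICIENTS = (6.56, 3.26, 6.72, 1.05)  # assumed: commonly published weights
DISTRESS_CUTOFF = 1.1                     # assumed lower cut-off
SAFE_CUTOFF = 2.6                         # assumed upper cut-off


def modified_z_score(wcta, reta, ebitta, bvebvtd):
    """Weighted sum of the four financial ratios."""
    ratios = (wcta, reta, ebitta, bvebvtd)
    return sum(c * r for c, r in zip(COEFFICIENTS, ratios))


def zone(z):
    """Map a Z-Score to one of the three target classes."""
    if z < DISTRESS_CUTOFF:
        return "Distress Zone"
    if z > SAFE_CUTOFF:
        return "Safe Zone"
    return "Gray Zone"


# Hypothetical ratios, not taken from the paper's dataset.
z = modified_z_score(wcta=0.10, reta=0.05, ebitta=0.03, bvebvtd=0.20)
print(round(z, 4), zone(z))
```

Keeping the weights and cut-offs as named constants makes it easy to swap in the exact values of whichever Z-Score variant a study adopts.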

There is a need to design an effective strategy for predicting financial distress based on financial
ratios; machine learning models are therefore widely used as an alternative for developing predictive
models in finance [11]-[13]. The support vector machine (SVM) method is used in analyzing bankruptcy,
where financial ratios serve as indicators and information sources for improving classification
[14]. The random forest method randomly selects a subset of features at each tree node, which reduces
the correlation among trees built on the bootstrap samples [15]. The logistic regression (LR) algorithm
can be used as an individual model to assess credit risk because it produces values in the range [0,1]
that are interpreted as the probability of an observation belonging to a class [16]. Techniques for
combining multiple predictive models have long been investigated, and ensemble techniques have been shown
to outperform individual models in accuracy [17]-[19]. The ensemble approach has the potential to
produce the best solution in a classification model because it combines several base algorithms
whose final result is decided by voting, so that in some cases its performance exceeds
that of any single base algorithm. Researchers have reported good performance
with ensemble learning [20]-[22]. An ensemble learning approach has been used for bankruptcy prediction
[23], while other researchers have combined neural networks with ensembles for stock price prediction [24].
The issues that need to be addressed when building an ensemble are which combination of single models to
use, how to obtain the output from each member of the ensemble, and how to train every
single model. The ensemble performs better than a single model in terms of predictive power [4], [25].

The strength of this research is the successful configuration of a modified bagging ensemble
model. An over-sampling method is used to transform the unbalanced financial ratio dataset into a balanced
one. A grid search is then applied to optimize the hyperparameters of the multi-layer perceptron
(MLP), which is integrated into bagging, where the over-sampling step resamples the bootstrap sample data to
explicitly change the class distribution. The model is then evaluated using the confusion matrix and the
area under the curve (AUC).
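
The balancing step can be sketched in a few lines of Python. The paper does not say which over-sampling algorithm is used, so plain random over-sampling with replacement is an assumption here; the function name and the toy data are hypothetical.

```python
import random
from collections import Counter


def random_oversample(rows, labels, seed=0):
    """Duplicate randomly chosen minority rows until every class matches the largest class."""
    rng = random.Random(seed)
    by_class = {}
    for row, label in zip(rows, labels):
        by_class.setdefault(label, []).append(row)
    target = max(len(group) for group in by_class.values())
    out_rows, out_labels = [], []
    for label, group in by_class.items():
        # Draw the missing rows from the class itself, with replacement.
        extra = [rng.choice(group) for _ in range(target - len(group))]
        for row in group + extra:
            out_rows.append(row)
            out_labels.append(label)
    return out_rows, out_labels


# Hypothetical toy data with a 1 / 3 / 17 class imbalance.
labels = ["Distress Zone"] * 1 + ["Gray Zone"] * 3 + ["Safe Zone"] * 17
rows = [[float(i)] for i in range(len(labels))]
balanced_rows, balanced_labels = random_oversample(rows, labels)
print(Counter(balanced_labels))
```

Over-sampling is normally applied to the training portion only, so that duplicated rows do not leak into the test set.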

The objective of this research is to build a robust and stable model for the
classification of financial datasets, assessed by accuracy, precision, recall, F1-score, and AUC.
An ensemble approach in which MLP hyperparameter optimization is integrated with bagging is proposed and
applied, while the logistic regression, support vector machine, and random forest algorithms
serve as comparisons. The programming language used is Python. Machine
learning in the financial sector can be used by shareholders, management, Bank Indonesia, and policymakers
for monitoring and evaluating banking performance based on the nature of the risks related to financing.


2. METHOD 

The population used in this study is the financial ratios of the banking industry in Indonesia.
Purposive sampling is applied, that is, the data are collected according to specific
criteria. The sample consists of Islamic banks that published financial statements for the 2010-2016 period,
available on the official website of each bank or through the website of the Financial Services
Authority. We choose the features used as independent or predictive variables based on the study in [3];
the data obtained are then processed using a financial ratio calculation approach to form a financial
ratio dataset. The features are working capital to total assets (WCTA), retained earnings to total assets
(RETA), earnings before interest and taxes to total assets (EBITTA), and book value of equity to book
value of total debt (BVEBVTD). These financial features are paired with three target classes, namely
distress zone, gray zone, and safe zone. A sample of the training dataset is shown in Table 1.


Table 1. The training dataset


Id  WCTA    RETA    EBITA   BVEBVTD  Z-Score  Target
1   0.5651  0.6760  0.0728  0.0710   0.7762   Distress Zone
2   5.4376  0.1144  0.0610  0.2961   5.9081   Safe Zone
3   0.0024  0.0114  0.0177  0.0813   0.0852   Distress Zone
4   1.8000  1.5000  0.7200  1.4300   5.7425   Safe Zone
5   3.5900  1.6250  0.6100  2.6800   2.1127   Gray Zone


The process of learning and testing the model starts by dividing the dataset randomly into two
parts: 70% for the training dataset and the remaining 30% for the test dataset. After the
split, different learning algorithms are fitted to the training data. Several
classification algorithms are tried at the training stage to reach optimal performance, and the confusion
matrix is used to evaluate the performance of a valid model. To improve model performance, the techniques
used are feature engineering, ensemble learning methods, and tuning of the algorithm parameters. Feature
engineering is done by examining each feature in the dataset and investigating its relevance to the target.
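
A random 70/30 split of this kind can be sketched as follows. The helper name, the seed, and the placeholder data are hypothetical, and the paper does not state whether the split was stratified by class, so a plain shuffle is assumed.

```python
import random


def split_70_30(rows, labels, seed=0):
    """Shuffle the row indices and cut them 70/30 into training and test parts."""
    indices = list(range(len(rows)))
    random.Random(seed).shuffle(indices)
    cut = round(0.7 * len(indices))
    train_idx, test_idx = indices[:cut], indices[cut:]
    X_train = [rows[i] for i in train_idx]
    y_train = [labels[i] for i in train_idx]
    X_test = [rows[i] for i in test_idx]
    y_test = [labels[i] for i in test_idx]
    return X_train, X_test, y_train, y_test


# Hypothetical data: 70 rows of four ratios each, with placeholder labels.
rows = [[i * 0.01, i * 0.02, i * 0.03, i * 0.04] for i in range(70)]
labels = ["Safe Zone" if i % 3 else "Gray Zone" for i in range(70)]
X_train, X_test, y_train, y_test = split_70_30(rows, labels)
print(len(X_train), len(X_test))
```

Fixing the seed keeps the split reproducible between runs, which matters when several models are compared on the same test rows.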

Through feature engineering, we obtain the Altman Z-Score from the financial ratios, which can affect the
prediction accuracy of the model. We used a modified ensemble method to produce a better-performing
model. Logistic regression classification (LRC), SVM, random forest (RF), and MLP are the base algorithms
applied, and they are integrated into a new ensemble bagging model, the bagging ensemble
learning model (BELM). When modeling BELM, we adjust the configuration of the dataset with the
over-sampling method to improve the quality of the dataset, and we optimize the MLP hyperparameters
integrated with the bagging model until we obtain the settings that give the best model performance. The
BELM model was validated and evaluated using the confusion matrix and AUC to assess the trained algorithm
and the distribution of the validation set error. The validation strategy uses accuracy, precision, recall,
and F1-score. The framework that we propose in this paper is shown in Figure 1.
The study process shown in Figure 1 is grouped into three
phases, namely: i) the data preprocessing phase, ii) development of a multi-class classifier model, and iii)
comparison of accuracy and confusion matrices. We discuss each phase in detail in the following sections.
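
One way to read "MLP hyperparameter optimization integrated with bagging" is to grid-search the MLP first and then wrap the best configuration in a bagging ensemble. The scikit-learn sketch below follows that reading, which is an assumption; the search grid, the synthetic data, and all variable names are hypothetical, and the over-sampling step is omitted for brevity.

```python
from sklearn.base import clone
from sklearn.datasets import make_classification
from sklearn.ensemble import BaggingClassifier
from sklearn.model_selection import GridSearchCV
from sklearn.neural_network import MLPClassifier

# Synthetic three-class data with four features stands in for the financial ratio dataset.
X, y = make_classification(n_samples=150, n_features=4, n_informative=3, n_redundant=0,
                           n_classes=3, n_clusters_per_class=1, random_state=0)

# Hypothetical search grid; the paper does not list the values it searched.
param_grid = {
    "activation": ["identity", "relu"],
    "hidden_layer_sizes": [(5,), (5, 10, 3)],
}
search = GridSearchCV(
    MLPClassifier(solver="lbfgs", max_iter=500, random_state=0),
    param_grid, cv=3, scoring="accuracy",
)
search.fit(X, y)

# Wrap an unfitted copy of the best MLP in a bagging ensemble of five members,
# each trained on a bootstrap sample of half the data.
bagged_mlp = BaggingClassifier(clone(search.best_estimator_), n_estimators=5,
                               max_samples=0.5, random_state=0)
bagged_mlp.fit(X, y)
print(search.best_params_, bagged_mlp.score(X, y))
```

Searching the MLP on its own keeps the grid small; tuning the bagging parameters jointly would be the alternative design, at a higher computational cost.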


[Flowchart boxes: Dataset (financial ratio); Data pre-processing; Over-sampling method; Learning schema;
Learning algorithm; Processed training data; Processed testing data; Testing and validation;
Confusion matrix, AUC; Model comparison; Learning meta-models classifier]

Figure 1. The machine learning model of bankruptcy prediction that is applied 


3. RESULTS AND DISCUSSION 
This section presents the results of the analysis and design of the model implementation for optimizing
the machine learning ensemble algorithm for bank bankruptcy prediction. The implementation stage requires
hardware and software that support the model built for bankruptcy prediction.
The hardware is a standard laptop with 4 GB of RAM and a 64-bit, 2.4 GHz Core i3-4000M CPU. The software is
Windows 10 Pro 64-bit as the operating system, with Python version 3.7, Jupyter Notebook, a graph
library, NumPy, Pandas, Matplotlib, Scikit-learn, H2O, and Mlxtend, together with an over-sampling
approach that turns the dataset into a balanced dataset.


3.1. Logistic regression classification (LRC) model 

Logistic regression can predict a categorical target variable [23], [24]. During the learning process,
the LRC model fits a sigmoid function, also called the logistic function, that can be applied to the
multi-class financial ratio dataset. It uses four input features in the form of financial ratios: working
capital to total assets (WCTA), retained earnings to total assets (RETA), earnings before interest and
taxes to total assets (EBITA), and book value of equity to book value of total debt (BVEBVTD), together
with the Z-Score feature and a target with three classes: distress zone, gray zone, and safe zone.
Table 2 shows the results of the LR model.

Referring to Table 2, the evaluation of the LR model on the multi-class classification target gives a
weighted-average precision of 66%, a recall of 81%, and an F1-score of
72%. The statistical evaluation of the accuracy ratio shows that the individual model
has a predictive accuracy of 81%. Applying LR to the training data results in a fairly high level of
accuracy, while the testing data gives a classification accuracy of 81%. This shows that the accuracy
of the LR model is good and that the model is not overfit.
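
For reference, the sigmoid function and its usual multi-class counterpart, the softmax, can be written as below. The paper names only the sigmoid; using softmax for the three zones is an assumption about how a multi-class logistic model produces class probabilities, and the scores are hypothetical.

```python
import math


def sigmoid(z):
    """Logistic function: maps any real score to a value in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def softmax(scores):
    """Turn one linear score per class into probabilities that sum to 1."""
    shifted = [s - max(scores) for s in scores]  # subtract the max for numerical stability
    exps = [math.exp(s) for s in shifted]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical linear scores for the distress, gray, and safe zones.
probs = softmax([-1.0, 0.5, 2.0])
print(sigmoid(0.0), probs)
```

The predicted class is the one with the largest probability, here the third (safe zone).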


Table 2. LRC model evaluation results


               Precision  Recall  F1-Score  Support
Distress Zone  0.00       0.00    0.00      1
Grey Zone      0.00       0.00    0.00      3
Safe Zone      0.81       1.00    0.89      17
Accuracy                          0.81      17
Macro AVG      0.27       0.33    0.30      21
Weighted AVG   0.66       0.81    0.72      21
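
The macro and weighted averages in Table 2 follow directly from the per-class rows: the macro average is the plain mean over classes, while the weighted average weights each class by its support. The short check below recomputes both from the printed values; the variable names are hypothetical.

```python
# Per-class (precision, recall, F1-score, support) values as printed in Table 2.
rows = {
    "Distress Zone": (0.00, 0.00, 0.00, 1),
    "Grey Zone": (0.00, 0.00, 0.00, 3),
    "Safe Zone": (0.81, 1.00, 0.89, 17),
}

total = sum(r[3] for r in rows.values())


def macro(i):
    """Unweighted mean of column i over the classes."""
    return sum(r[i] for r in rows.values()) / len(rows)


def weighted(i):
    """Mean of column i weighted by each class's support."""
    return sum(r[i] * r[3] for r in rows.values()) / total


macro_avg = [round(macro(i), 2) for i in range(3)]
weighted_avg = [round(weighted(i), 2) for i in range(3)]
print(macro_avg, weighted_avg)
```

Because 17 of the 21 test samples are in the safe zone, the weighted averages stay high even though the two minority classes score zero, which is why the macro averages are the more telling figures on unbalanced data.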


3.2. Support vector machine (SVM) model 

The SVM model is a classification method that applies statistical learning theory and is trained
using learning algorithms based on optimization theory. It can provide good classification results by
mapping the original data to a higher-dimensional space through kernel techniques, and it separates
two classes by trying to find the best hyperplane [25]. If the dataset is perfectly separable, the SVM
algorithm has the potential to reach 100% classification accuracy; in empirical data on economic
variables such as financial ratios, however, the probability of this is very small [25].
The results of the statistical evaluation of the SVM model are shown in Table 3.
The SVM model reaches a classification accuracy of 81%, which indicates that
it models the existing data well.
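
The kernel idea can be illustrated with the radial basis function (RBF) kernel, the kernel named in the Figure 2 listing. The sketch below evaluates it for two feature vectors; the gamma value and the sample ratios are hypothetical.

```python
import math


def rbf_kernel(x, y, gamma=1.0):
    """k(x, y) = exp(-gamma * ||x - y||^2), a similarity between 0 and 1."""
    squared_distance = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * squared_distance)


# Two hypothetical banks described by four financial ratios each.
bank_a = [0.10, 0.05, 0.03, 0.20]
bank_b = [0.40, 0.15, 0.08, 0.60]
print(rbf_kernel(bank_a, bank_a), rbf_kernel(bank_a, bank_b))
```

The kernel returns 1 for identical inputs and decays toward 0 as the points move apart, which lets the SVM separate classes in an implicit higher-dimensional space without computing the mapping itself.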


Table 3. Results of the SVM model


               Precision  Recall  F1-Score  Support
Distress Zone  0.00       0.00    0.00      1
Grey Zone      0.00       0.60    0.75      3
Safe Zone      0.81       0.00    0.89      17
Accuracy                          0.81      21
Macro AVG      1.00       0.87    0.88      21
Weighted AVG   0.66       0.81    0.72      21


3.3. Random forest (RF) model 

RF is a classifier that fits a number of classification trees on sub-samples of the dataset and
averages their outputs to increase accuracy and reduce over-fitting. The sub-sample size is controlled
by the max_samples parameter when bootstrap=True; otherwise
the entire dataset is used to construct each tree. During the learning process, the RF algorithm finds a
classification function that can be applied to the multi-class financial ratio dataset. The evaluation of
the RF model is described in Table 4. The
classification target is multi-class data, and the statistical evaluation of the RF model
shows that the individual model has a predictive accuracy of 90%, where the evaluated
values are accuracy, precision, recall, and F1-score. This shows a very good level of accuracy without
overfitting.

Int J Artif Intell, ISSN: 2252-8938, Vol. 11, No. 2, June 2022: 679-686


Table 4. Results of the RFC model evaluation


               Precision  Recall  F1-Score  Support
Distress Zone  0.0        0.00    0.00      1
Grey Zone      0.6        0.00    0.75      3
Safe Zone      1.0        1.00    0.90      17
Accuracy                          0.90      21
Macro AVG      0.5        0.6     0.5       21
Weighted AVG   0.90       0.90    0.89      21


3.4. Bagging ensemble learning (BELM) 

The new BELM model is a classification method that combines the LRC, SVM, RF, and neural
network (NN) algorithms through hard voting, applying ensemble learning theory to train several
models and combine their predictions. The idea is to combine predictions from models that are
individually good but different from one another. Because the classification models differ, they make
different prediction errors on the dataset [17]. Combining two or more models by voting
reduces the variance of the predictions, so the ensemble can produce
better predictions than the single best model. A hard voting classifier collects the predictions of each
classifier and predicts the class that receives the most votes. When all classifiers in the
ensemble can estimate class probabilities, that is, they all have a predict_proba
method, Scikit-Learn can instead be told to predict the class with the highest probability
averaged over all the individual classifiers (soft voting) [11], [18]-[20].
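
The two voting rules described above can be sketched without any library. The function names, the five predictions, and the probability values below are hypothetical illustrations, not outputs of the BELM model.

```python
from collections import Counter


def hard_vote(predictions):
    """Return the label predicted by the most classifiers."""
    return Counter(predictions).most_common(1)[0][0]


def soft_vote(probabilities):
    """Average the per-class probabilities over classifiers and return the most probable class."""
    classes = probabilities[0].keys()
    mean = {c: sum(p[c] for p in probabilities) / len(probabilities) for c in classes}
    return max(mean, key=mean.get)


# Hypothetical predictions from five base classifiers for one bank.
votes = ["Safe Zone", "Gray Zone", "Safe Zone", "Safe Zone", "Distress Zone"]

# Hypothetical class probabilities from three classifiers for the same bank.
probas = [
    {"Distress Zone": 0.2, "Gray Zone": 0.5, "Safe Zone": 0.3},
    {"Distress Zone": 0.1, "Gray Zone": 0.3, "Safe Zone": 0.6},
    {"Distress Zone": 0.2, "Gray Zone": 0.3, "Safe Zone": 0.5},
]
print(hard_vote(votes), soft_vote(probas))
```

Hard voting only needs class labels, so it works with any classifier; soft voting uses more information but requires every member to output probabilities.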

The implementation of the BELM model on a balanced dataset using Python is shown in Figure 2.
The results of the evaluation of the BELM classification model can be seen in
Table 5. Figure 2 shows the initialization of
the random_state parameter, which is applied so that the algorithm always gives the same output for the
same input. In the LRC algorithm, the best parameter value for penalty is 'l2' and for C is 20000.
Setting n_estimators=10 in the RF algorithm means that 10 decision trees are built
and their predictions are aggregated, while class_weight='balanced' is applied to handle
the unbalanced classes in the multi-class dataset. The RBF kernel is the best kernel function setting in
the SVM algorithm; a kernel function allows us to work in a higher-dimensional space (feature space)
without having to define the mapping function explicitly. In the artificial neural network algorithm, the
activation function 'identity' and the solver 'lbfgs' are the best settings for the hidden layer
architecture. The BELM model makes new predictions by combining several models, each of which produces a
different estimator. The class with the most votes becomes the final decision. From the classification
report, it can be seen that precision, recall, F1-score, and accuracy all increase.

Referring to Table 5, the classification target is multi-class data, and the
statistical evaluation of the model gives an accuracy of 97%, where the values evaluated are accuracy,
precision, recall, and F1-score. This shows a very good improvement due to the modification of the ensemble.
This model provides an improvement of 7 to 16 percentage points in accuracy over the previous models
(90% for RF and 81% for LRC and SVM).


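from sklearn.linear_model import LogisticRegression
from sklearn.svm import SVC
from sklearn.neural_network import MLPClassifier
from sklearn.ensemble import RandomForestClassifier, BaggingClassifier, VotingClassifier
from imblearn.ensemble import BalancedRandomForestClassifier

RANDOM_SEED = 42  # assumed value; the original listing does not show the seed

lr_clf = LogisticRegression(C=20000, penalty='l2', random_state=RANDOM_SEED)

rf_clf = RandomForestClassifier(n_estimators=10, class_weight='balanced')

svc_clf = SVC(C=10000.0, kernel='rbf', random_state=RANDOM_SEED)

# Balanced random forest; defined in the listing but not passed to the voting ensemble
rfc = BalancedRandomForestClassifier(n_estimators=10, class_weight='balanced')

ann_clf = MLPClassifier(hidden_layer_sizes=(5, 10, 3), activation='identity', solver='lbfgs')

bg = BaggingClassifier(RandomForestClassifier(), max_samples=0.5, max_features=1.0)

# Hard-voting ensemble over the five base models
evclr = VotingClassifier(estimators=[('lr_clf', lr_clf), ('svc_clf', svc_clf), ('rf_clf', rf_clf),
                                     ('bg', bg), ('ann_clf', ann_clf)], voting='hard')

# X_train, y_train, X_test, y_test are assumed to come from the 70/30 split of the balanced dataset
evclr.fit(X_train, y_train)

evclr.score(X_train, y_train), evclr.score(X_test, y_test)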


Figure 2. Implementation of BELM


Table 5. Results of the BELM model evaluation


               Precision  Recall  F1-Score  Support
Distress Zone  0.94       0.96    0.95      16
Grey Zone      1.00       0.90    0.95      10
Safe Zone      1.00       0.00    1.00      17
Accuracy                          0.97      38
Macro AVG      0.95       0.95    0.95      38
Weighted AVG   0.94       0.95    0.95      38


4. CONCLUSION 

This study was conducted to develop an integrated ensemble learning model with MLP
hyperparameter optimization for bank bankruptcy prediction on financial ratio datasets, where the Altman
Z-Score calculation model is the basis for classification and the predictions of several machine learning
classifiers are combined. The proposed bagging
ensemble model improves accuracy, recall, precision, and F1-score by 7% to 16% when tested
using the same procedure. The classification reaches 97% accuracy while producing a stable solution that
can handle overfitting, overcome class imbalance and bias, and determine appropriate
hyperparameters for the machine learning environment. This supports the results of previous studies. We use
a dataset of financial ratios for the banking industry in Indonesia. With reliable models and indicators,
policymakers can anticipate a company’s financial crisis. Finally, the proposed bagging
ensemble can be applied as an alternative for predicting bankruptcy in the banking industry. For further
research, the modified ensemble model can be developed into an integrated internet of things (IoT) system,
in which the system retrieves the necessary financial ratio data from financial statements issued
by banks registered with Bank Indonesia. To get optimal performance from the modified ensemble model, it
is recommended to compare it with other ensemble models such as boosting and stacking
ensembles.